Toward Adapting Spatial Audio Displays For Use With Bone Conduction: The Cancellation of Bone-conducted and Air-conducted Sound Waves
Running Head: SOUND WAVE CANCELLATION THROUGH BONE AND AIR
Abstract
Virtual three-dimensional (3D) auditory displays use signal-processing techniques to alter sounds presented through headphones so that they seem to originate from specific spatial locations. A full function of phase and amplitude shift values could be used to adapt virtual 3D auditory displays for use with bone-conduction headsets (bonephones). This study provided anchor points for such a function. The shift values were established by having participants adjust the phase and amplitude of two waves until the waves cancelled each other and produced silence. These adjustments occurred in a listening environment consisting of air-conducted and bone-conducted tones, as well as air-conducted masking. Performance in the calibration condition suggested that participants understood the task and could perform it with reasonable accuracy. In the bone-to-air listening conditions, the data produced a clear set of anchor points for an amplitude shift function. The data did not, however, reveal anchor points for a phase shift function: the phase data were highly variable and inconsistent. Application of the shifts, future research to establish full functions and better understand phase, and validation and follow-up studies are discussed.
Toward Adapting Spatial Audio Displays For Use With Bone Conduction: The Interaction of Bone-conducted Waves and Air-conducted Waves
There are a variety of reasons for using sound to convey information to a listener. These include conveying speech signals, as well as conveying information to a person whose eyes are busy or to a person who is visually impaired. Regardless of the application, auditory stimuli (sounds) are typically presented to a listener through air, using loudspeakers or headphones. Headphones allow private presentation of high-fidelity dichotic (stereo) sounds to a listener, without the perception changing as the listener moves and turns, all in a portable package.
On the other hand, there are problems with headphones. Covering the ears with headphones degrades the detection and localization of ambient sounds in the environment. Those external sounds may be of particular interest in augmented reality and tactical situations, as well as for visually impaired users who rely on environmental audio cues as their primary means of orientation. Furthermore, headphones cannot be used for auditory display at the same time as most types of hearing protection. These situations would benefit from an alternative to headphones. Because the auditory system is also sensitive to pressure waves transmitted through the bones of the skull (Békésy, 1960; Tonndorff, 1972), bone conduction may lead to an acceptable solution. Although bone conduction of sound occurs naturally when listening to one’s own voice and to loud external sounds, sound can also be transmitted directly through the skull via mechanical transducers. Presenting auditory information to listeners through bone conduction by placing vibrators on the skull can afford the same privacy and perceptual constancy that standard headphones offer, yet leave the ear canal and pinna uncovered. This may improve the detection and localization of environmental sounds, and it allows the display of auditory information even when hearing protection is inserted into the ear canal. Bone-conduction devices also cater to the preferences of users who would rather not have their ears occluded (Walker, Stanley, & Lindsay, 2005b).
The use of bone-conduction transducers to deliver sound is not new. Because bone-conducted sound bypasses the middle ear and directly stimulates the cochlea, bone conduction is typically used in clinical audiology settings to assess the locus of hearing damage in patients.
Because they were developed for such clinical purposes, most bone-conduction transducers in production are not suitable for use in an auditory display: they typically consist of a single transducer, which is bulky and requires special equipment to drive it. Recently, compact binaural bone-conduction headsets have become available. Due to their potential for stereo presentation of sounds, their small size, comfort, and standardized input jack, these “bonephones” are much more suitable for implementation in auditory displays. The transducers of the very latest bonephones rest on the mastoid, which is the raised portion of the temporal bone located directly behind the pinna. The mastoid is a preferable transducer location relative to the forehead or temple (used in previous bone-conduction devices) because it contains the inner ear, is relatively immune to the interference associated with the muscle tissue operating the jaw, and allows dichotic presentation of sounds.
Bone Conduction and Spatial Audio
Most of the psychoacoustics research and virtually all of the human factors research on auditory displays has assumed the conduction of sound through air (i.e., from speakers or headphones), and thus has overlooked the alternative acoustic pathway of bone conduction. Because sound design guidelines established for air conduction will not necessarily apply to bone conduction, auditory display design needs to be re-evaluated for bone conduction. One type of auditory display that requires extensive research to implement with bone-conducted audio is a virtual three-dimensional (3D) auditory display. In this type of display, sounds are typically presented through headphones, after being processed to make them sound like they are originating from specific spatial locations outside the head (i.e., they are “spatialized”).
Virtual 3D audio displays have recently gained popularity because they can increase the detectability of signals amid distractors and noise (e.g., Brungart & Simpson, 2002) and provide orientation cues in cases of vision loss (e.g., Walker & Lindsay, 2006). Spatializing audio signals for virtual 3D auditory displays is a complex process, based on considerable psychophysical research investigating how to manipulate acoustic cues to produce a reliable percept of sounds originating from different locations (see Blauert, 1983). Because spatialized audio is typically delivered through air conduction via traditional headphones (which cover the ears), the perception of environmental sounds is degraded. As a result, regular headphones force a tradeoff between hearing spatialized audio and hearing external sounds. This tradeoff is a problem when spatialized audio and sounds in the environment are both important for the user’s task, as with audio navigation systems like the System for Wearable Audio Navigation (SWAN) (Walker & Lindsay, 2006). The SWAN is just one example of a system that could benefit from presentation of 3D audio via bone conduction. However, there is little research on whether bonephones can effectively replace headphones for the display of spatial audio, and on how the audio would need to be processed to produce virtual sound source locations.
One approach to evaluating the potential effectiveness of bonephones for spatial audio is simply to replace headphones with a pair of bonephones, use standard spatial audio filters developed for headphones, and measure how well people can perform the spatial audio task. Although this approach has shown that bonephones can produce some degree of spatial audio (Walker & Lindsay, 2005), higher performance and greater perceptual fidelity may be achieved if the processing applied to sounds for spatialization is customized for the bonephones.
A substantial difference in optimal acoustic parameters between air and bone conduction is likely, given the very different mechanisms through which those pathways transmit sound to the cochlea. Air-conducted signals are filtered by the pinna and ear canal, as well as by the workings of the tympanic membrane (eardrum) and the ossicles in the middle ear. The ossicles connect to the cochlea at the oval window; this results in standing waves on the basilar membrane, which are converted into neural impulses by the hair cells (Sekuler, 2002). For the bone-conducted signal, however, the majority of the perception results from the signal traveling directly through the skull and shaking the cochlea to set up standing waves on the basilar membrane (Békésy, 1960; Yost, 1994). Further, the bone-conduction pathway does not need to accomplish the impedance matching that air conduction does (Tonndorff, 1972). The goal of the research presented in this document is to begin to identify how the techniques for processing sounds presented through headphones can be customized for bonephones. With this information, spatial audio displays can be tuned to be more effective with bonephones.
Relevant Research: Threshold Measurement and Related Discussion
A formal evaluation of using bonephones to present spatial audio requires understanding the basic properties of the bone-conduction hearing mechanisms, spatial audio cues, and how the virtual 3D audio is created. As with air-conduction hearing, some basic information about thresholds of audibility has been gathered for bone conduction. In particular, most research on bone conduction has focused on establishing threshold norms for clinical testing of middle ear disorders. This clinical research has yielded threshold curves such as those shown in Figure 1. The methods and implications of this research can inform the design of research aimed at using bonephones in spatial audio displays.
Figure 1. Bone-conduction thresholds from several researchers. The y-axis is in units that are the bone-conducted equivalent to the decibel measurement used for air conduction (see text). These thresholds are used to define “normal” hearing in order to screen people for middle ear disorders. (* values estimated from graph)
The y-axis units (dB) in Figure 1 are not exactly the same as the units used to describe air-conduction hearing thresholds (i.e., dB SPL), because bone-conduction intensity levels cannot be measured by simply placing a sound level meter with a microphone up to the transducer. Rather, a standardized mechanical coupler that simulates the impedance of a human mastoid, an “artificial mastoid,” picks up the vibration from the bone-conduction transducer. Within the artificial mastoid, the vibration is picked up by piezoelectric discs, which convert the vibration into a voltage that can be measured by the electronics in the sound level meter. A decibel value expresses, on a logarithmic scale, the ratio between two intensities, with the measured intensity in the numerator and a reference intensity in the denominator. The ratio of voltages from the artificial mastoid creates a decibel metric for bone conduction, just as a ratio of voltages sent from the microphone creates a decibel metric for air conduction. This makes it possible to directly compare decibel values between bone conduction and air conduction. How those values match up depends greatly on the reference intensity chosen.
Long before standardized bone-conduction thresholds for clinical purposes were developed, Georg von Békésy (1960) completed some of the initial investigations into hearing through bone conduction.
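As a rough illustration of the decibel ratio described above, the following Python sketch computes a level from a pair of voltages and shows how the choice of reference shifts the result by a constant offset. The function names and the voltage values are hypothetical, chosen only for illustration; they are not taken from the measurement equipment or data discussed here.

```python
import math


def level_db_from_intensity(measured, reference):
    """Level in dB from a ratio of intensities (power-like quantities)."""
    return 10.0 * math.log10(measured / reference)


def level_db_from_voltage(measured, reference):
    """Level in dB from a ratio of voltages (amplitude-like quantities).

    Intensity is proportional to voltage squared, hence the factor of 20.
    """
    return 20.0 * math.log10(measured / reference)


# Hypothetical voltages in volts, for illustration only (not measured data).
v_measured = 0.2
v_reference_a = 0.002
v_reference_b = 0.02

level_a = level_db_from_voltage(v_measured, v_reference_a)  # about 40 dB
level_b = level_db_from_voltage(v_measured, v_reference_b)  # about 20 dB

# The same measured voltage yields levels 20 dB apart under the two references.
print(round(level_a, 3), round(level_b, 3), round(level_a - level_b, 3))
```

The same transducer output therefore maps to different decibel values under different references, which is why bone-conduction and air-conduction levels can only be compared once the reference quantities are stated.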
In addition to thresholds, Békésy’s investigations included a wide variety of related topics: the specification of the nodes formed in the skull when vibrated, measurement of the linearity of sound transmission through skin, the resonant frequency of the ossicles, and the speed of sound through the skull. For threshold measurement, Békésy had listeners alter the phase and amplitude of waveforms until they cancelled each other out and produced silence. Specifically, listeners adjusted the phase and amplitude of a pure-tone wave presented through air conduction until it cancelled out a static bone-conducted signal from a vibrator on the forehead. The change in amplitude needed to cancel out the wave in air was then taken as the threshold value for bone conduction. The phase adjustments were not reported in his publications (at least not the ones written in English). The focus of Békésy’s interpretation of his findings was that cancellation could be done between air and bone, which suggested that air and bone conduction shared similar mechanisms, at least at some level.
Some modern threshold research has been more focused on applications to auditory displays. Specifically, the threshold curve for the bonephones has been plotted under a variety of listening conditions: Walker and Stanley (2005) conducted an applied assessment of how much relative power needs to be driven into bonephones for a listener to hear a sound at a variety of frequencies in various practically relevant listening conditions. Figure 2 shows the relative intensities of sounds sent to the bonephones for the listener to detect, for each frequency and listening condition. Note that the top of the y-axis is 0 dB attenuation, which represents the maximum intensity sound. As the position on the y-axis descends from this maximum, the magnitude of the attenuation increases. Thus, lower points on the y-axis indicate that a quieter sound could be detected.
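The cancellation procedure described above for Békésy’s threshold measurements can be illustrated with a short Python sketch. It assumes two idealized pure tones of the same frequency that simply add, so the amplitude of their sum is sqrt(A² + B² + 2AB cos φ) for amplitudes A and B and relative phase φ. The function name and the example amplitudes and phases are hypothetical and are not taken from any of the cited studies.

```python
import math


def residual_amplitude(amp_a, amp_b, phase_deg):
    """Amplitude of the sum of two equal-frequency sine waves.

    amp_a, amp_b: amplitudes of the two tones (arbitrary linear units).
    phase_deg: phase of the second tone relative to the first, in degrees.
    """
    phase = math.radians(phase_deg)
    squared = amp_a ** 2 + amp_b ** 2 + 2.0 * amp_a * amp_b * math.cos(phase)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, squared))


# Hypothetical settings, for illustration only.
in_phase = residual_amplitude(1.0, 1.0, 0.0)       # tones reinforce each other
cancelled = residual_amplitude(1.0, 1.0, 180.0)    # matched amplitude, opposite phase
phase_error = residual_amplitude(1.0, 1.0, 170.0)  # phase off by 10 degrees
level_error = residual_amplitude(1.0, 0.8, 180.0)  # amplitude mismatch

print(in_phase, cancelled, phase_error, level_error)
```

In this idealized model, silence requires both adjustments at once: a matched amplitude and a phase difference of 180 degrees. An error in either one leaves an audible residual.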
Because sound intensity in Figure 2 was specified at the level of the input into the bonephones, the threshold curve represents the combined frequency response of both the bone-conduction hearing mechanisms and the bonephones device. These sensitivity specifications are useful because they can be used to optimize audio for the bonephones under the various listening conditions. The curves suggest that, for equal detection performance, low and high frequencies need to be more intense than middle frequencies (i.e., 570 to 1850 Hz). Indeed, subjective listening experience with unaltered sound played through the bonephones suggests that low-frequency sounds are typically not loud enough and midrange-frequency sounds are too loud. Essentially, a different equalization setting is needed, due to the differences between air and bone conduction. These differences include disparities in the auditory mechanism through which sound travels as well as the physical properties of the device used to deliver the audio.
Figure 2. Threshold of Audibility: Masked, Open, and Plugged. Shown are threshold curves measured by Walker & Stanley (2005) with bonephones when ears were open, plugged, or masked with 45 dB pink noise. On the y-axis, lower position represents better sensitivity (i.e., detection of softer sounds). The attenuation specifies the input into the bonephones, where more attenuation means a less intense sound sent to the bonephones could be detected. The error bars represent one standard error above and below the mean.
A description of the relative intensities sent to the bonephones in order to detect a signal (Walker & Stanley, 2005) is helpful in understanding purposeful spectral changes that can be made to sounds as part of processing them for spatialization.
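To make the attenuation scale concrete, the Python sketch below converts an attenuation in dB into a linear amplitude factor, assuming the attenuation is applied to signal amplitude (so 20 dB corresponds to a factor of 10). The function name and the per-frequency values are hypothetical placeholders and are not read from Figure 2.

```python
def attenuation_db_to_gain(attenuation_db):
    """Linear amplitude factor for an attenuation in dB (0 dB maps to 1.0)."""
    return 10.0 ** (-attenuation_db / 20.0)


# Hypothetical attenuation settings (frequency in Hz -> attenuation in dB).
# These are placeholders for illustration, NOT data from Figure 2.
example_attenuation_db = {250: 20.0, 1000: 40.0, 4000: 20.0}

example_gains = {
    freq_hz: attenuation_db_to_gain(att_db)
    for freq_hz, att_db in example_attenuation_db.items()
}

for freq_hz, gain in example_gains.items():
    print(f"{freq_hz} Hz: amplitude factor {gain:.3f}")
```

An equalization setting of the kind discussed above could, under these assumptions, be expressed as one such factor per frequency band.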
The Walker and Stanley (2005) research was the first in a potential series of investigations that could lead to a detailed description of the signal processing that needs to be applied to spatialize sounds played through the bonephones.
Acoustic Cues For Spatial Separation
The two acoustic cues that produce the perceptual experience of lateralization¹ are interaural level differences (ILDs) and interaural time differences (ITDs) (Yost & Hafter, 1987). To implement spatial audio with the bonephones, listeners must be sensitive to these basic spatial audio cues when they are delivered through bone conduction. Until recently, many researchers have assumed that spatial audio with bone conduction is not possible, because the interaural attenuation, and thus the maximum ILD, was not considered sufficient (Blauert, 1983; Goldstein & Newman, 1994). On the other hand, audiology handbooks indicate that bone-conducted interaural attenuation (BC IA) may be greater than zero, and as much as 20 dB, though audiologists often assume its lower-bound estimate of zero dB (e.g., Katz, 2002). There have been few investigations into BC IA, and these have been inconclusive (e.g., Liden, Nilsson, and Anderson, 1959; Hood, 1960). The language of these resources suggests that the “worst-case scenario” is more important than what the empirical evidence alone reveals. This conservatively biased estimate of BC IA is appropriate for clinical purposes, where erring on the side of caution is preferred. For the purposes of adapting spatial audio filters for air conduction so that they are suitable for bonephones, however, a neutral approach is more suitable.
¹ Lateralization involves space in only one dimension (spatial separation within the head), whereas spatialization involves space in three dimensions (invoking the percept of a sound outside the head and in the vertical dimension). Lateralization is a logical first step toward 3D audio via bonephones, because if lateralization is not possible, then spatialization is not possible.
New information about sensitivity to interaural differences delivered through bone conduction gives a different perspective on the level of BC IA than typical audiology guidelines do. In a direct assessment of sensitivity to interaural differences, Kaga, Setou, and Nakamura (2001) found that the subjective report of image lateralization depended systematically on interaural differences delivered through binaural application of clinical bone-conduction vibrators. The researchers showed sensitivity to ILDs and ITDs in children with normal hearing, as well as in children with abnormalities of the middle and outer ears. Furthermore, in participants with normal hearing, these sensitivities were not significantly different from ITDs and ILDs assessed through air conduction (see Table 1 for threshold values). The air-conduction interaural thresholds found by Kaga et al. (2001) are higher than many other estimates by psychoacoustics researchers. The reason for the
Table 1. ILD and ITD Discrimination Thresholds for Otologically Normal Children, from Kaga et